ManifoldBoost: Stagewise Function Approximation for Fully-, Semi- and Un-supervised Learning
Abstract
We introduce a boosting framework to solve a classification problem with added manifold and ambient regularization costs. It allows for a natural extension of boosting into both semisupervised problems and unsupervised problems. The augmented cost is minimized in a greedy, stagewise functional minimization procedure as in GradientBoost. Our method provides insights into generalization issues in GradientBoost as applied to trees; these phenomena are relevant also to manifold learning. We describe a quite general framework and then discuss a specific case based on L2 TreeBoost. This framework naturally accommodates supervised learning, manifold learning, partially supervised learning and unsupervised clustering as particular cases. Multiclass learning tasks fit naturally into the framework as well. Unlike other manifold learning approaches, the family of algorithms derived has linear complexity in the number of datapoints. The performance of our method is at the state of the art on some standard problems, and exceeds the state of the art on others.
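To make the stagewise minimization described in the abstract concrete, here is a minimal sketch under stated assumptions; it is not the authors' algorithm. It assumes a squared loss on the labeled points plus a graph-Laplacian smoothness penalty, and it fits regression stumps to the negative functional gradient, as in an L2-style gradient boost. The cost form, the Gaussian neighbor weights, the omission of the ambient regularizer, and every function and parameter name (`graph_laplacian`, `manifold_boost_sketch`, `lam`, `sigma`, `lr`) are illustrative assumptions. The dense graph used here costs quadratic time and memory, so it does not reflect the linear complexity claimed for the actual method.

```python
import numpy as np

def graph_laplacian(X, sigma=1.0):
    """Dense Gaussian-weight graph Laplacian L = D - W (assumed graph, O(n^2))."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return np.diag(W.sum(1)) - W

def cost(F, y, labeled, L, lam):
    """Assumed cost: squared loss on labeled points + lam * sum_ij W_ij (F_i - F_j)^2."""
    y0 = np.where(labeled, y, 0.0)
    fit = (labeled * (y0 - F) ** 2).sum()
    smooth = 2.0 * lam * (F @ L @ F)  # equals lam * sum_ij W_ij (F_i - F_j)^2
    return fit + smooth

def fit_stump(X, r):
    """Least-squares regression stump (depth-1 tree) fitted to targets r."""
    best_err, best = np.inf, (0, -np.inf, r.mean(), r.mean())
    for j in range(X.shape[1]):
        # Exclude the largest value so the right leaf is never empty.
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            lv, rv = r[left].mean(), r[~left].mean()
            err = ((r - np.where(left, lv, rv)) ** 2).sum()
            if err < best_err:
                best_err, best = err, (j, t, lv, rv)
    return best

def predict_stump(stump, X):
    j, t, lv, rv = stump
    return np.where(X[:, j] <= t, lv, rv)

def manifold_boost_sketch(X, y, labeled, lam=0.1, n_rounds=50, lr=0.1, sigma=1.0):
    """Greedy stagewise descent on the assumed cost; returns (stumps, F on X)."""
    L = graph_laplacian(X, sigma)
    m = labeled.astype(float)
    y0 = np.where(labeled, y, 0.0)  # labels of unlabeled points are ignored
    F = np.zeros(X.shape[0])
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the cost with respect to the values F(x_i).
        g = 2.0 * m * (y0 - F) - 4.0 * lam * (L @ F)
        s = fit_stump(X, g)
        F += lr * predict_stump(s, X)  # shrunken stagewise update
        stumps.append(s)
    return stumps, F

def predict(stumps, X, lr=0.1):
    """Evaluate the boosted function on new points (lr must match training)."""
    F = np.zeros(X.shape[0])
    for s in stumps:
        F += lr * predict_stump(s, X)
    return F
```

In this sketch, setting `lam=0` with all points labeled reduces to plain L2 boosting with stumps, while a `labeled` mask that is mostly `False` gives a semi-supervised variant in which the Laplacian term is the only signal on unlabeled points.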
Similar resources
A Stagewise Least Square Loss Function for Classification
This paper presents a stagewise least square (SLS) loss function for classification. It uses a least square form within each stage to approximate a bounded monotonic nonconvex loss function in a stagewise manner. Several benefits are obtained from using the SLS loss function, such as: (i) higher generalization accuracy and better scalability than classical least square loss; (ii) improved perfo...
Application of three graph Laplacian based semi-supervised learning methods to protein function prediction problem
Protein function prediction is the important problem in modern biology. In this paper, the un-normalized, symmetric normalized, and random walk graph Laplacian based semi-supervised learning methods will be applied to the integrated network combined from multiple networks to predict the functions of all yeast proteins in these multiple networks. These multiple networks are network created from ...
Hypergraph and protein function prediction with gene expression data
Most network-based protein (or gene) function prediction methods are based on the assumption that the labels of two adjacent proteins in the network are likely to be the same. However, assuming the pairwise relationship between proteins or genes is not complete, the information a group of genes that show very similar patterns of expression and tend to have similar functions (i.e. the functional...
AdaBoost and Forward Stagewise Regression are First-Order Convex Optimization Methods
Boosting methods are highly popular and effective supervised learning methods which combine weak learners into a single accurate model with good statistical performance. In this paper, we analyze two well-known boosting methods, AdaBoost and Incremental Forward Stagewise Regression (FSε), by establishing their precise connections to the Mirror Descent algorithm, which is a first-order method in...
The Un-normalized Graph p-Laplacian based Semi-supervised Learning Method and Speech Recognition Problem
Speech recognition is the classical problem in pattern recognition research field. However, just a few graph based machine learning methods have been applied to this classical problem. In this paper, we propose the un-normalized graph p-Laplacian semi-supervised learning methods and these methods will be applied to the speech network constructed from the MFCC speech dataset to predict the label...